
Update Friendly Evaluator#46076

Open
w-javed wants to merge 10 commits into feature/azure-ai-projects/2.0.2 from
waqasjaved02/friendly-evaluator-properties-output

Conversation

@w-javed
Contributor

@w-javed w-javed commented Apr 2, 2026

No description provided.

@github-actions

github-actions bot commented Apr 2, 2026

API Change Check

APIView identified API-level changes in this PR and created the following API reviews:

azure-ai-projects

w-javed and others added 4 commits April 2, 2026 18:37
Update the FriendlyEvaluator sample to return the new standard output
format with score, label, reason, threshold, and passed at the top level.
Extra evaluator output fields (explanation, tone, confidence) are nested
under a properties dict.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
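The commit above describes the new standard output shape but does not show it; a minimal sketch, assuming the layout described (score, label, reason, threshold, passed at the top level; extra fields under a properties dict). The helper name build_result and the field values are illustrative, not part of the SDK.

```python
# Hypothetical sketch of the standard evaluator output format described
# above. build_result is an illustrative helper, not an SDK function.

def build_result(score: float, threshold: float, *, tone: str, confidence: float) -> dict:
    """Assemble an evaluation result in the standard output format."""
    passed = score >= threshold
    return {
        # Required top-level fields.
        "score": score,
        "label": "pass" if passed else "fail",
        "reason": f"score {score} vs threshold {threshold}",
        "threshold": threshold,
        "passed": passed,
        # Extra evaluator-specific fields (e.g. tone, confidence) are
        # nested under "properties" rather than placed at the top level.
        "properties": {"tone": tone, "confidence": confidence},
    }
```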
- Use 'from openai import OpenAI' instead of AzureOpenAI
- Accept api_key and model params instead of model_config dict
- Use client.responses.create() instead of chat.completions.create()
- Update util.py: split build_evaluation_messages into
  build_evaluation_instructions() and build_evaluation_input()
- Update sample init_parameters schema accordingly

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
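The util.py split mentioned above might look like the following sketch: one helper for the static instructions and one for the per-sample input, which map naturally onto the `instructions` and `input` arguments of `client.responses.create()`. The function bodies and prompt wording here are assumptions, not the actual sample code.

```python
# Hypothetical sketch of splitting build_evaluation_messages() into two
# helpers, as the commit describes. Prompt text is illustrative.

def build_evaluation_instructions(criteria: str) -> str:
    """Static instructions, passed once per evaluation request."""
    return (
        "You are an evaluator. Rate the response against this criterion: "
        f"{criteria}. Reply with a score from 1 to 5."
    )

def build_evaluation_input(query: str, response: str) -> str:
    """Per-sample input combining the user query and model response."""
    return f"Query: {query}\nResponse: {response}"

# With the OpenAI Responses API, these feed the `instructions` and `input`
# parameters of client.responses.create(model=..., instructions=..., input=...).
```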
Address aprilk-ms review: annotate which fields in the evaluation
result dict are required vs optional for the evaluation service.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Align sample_eval_upload_friendly_evaluator.py with the updated
FriendlyEvaluator that takes api_key and model instead of
deployment_name/model_config.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@w-javed w-javed force-pushed the waqasjaved02/friendly-evaluator-properties-output branch from 5c23650 to 805a66c on April 3, 2026 at 01:38
w-javed and others added 3 commits April 2, 2026 23:31
Merge sample_custom_evaluator_friendly_evaluator.py into
sample_eval_upload_friendly_evaluator.py so the sample first runs
FriendlyEvaluator locally, then uploads, creates eval, and runs it.
Fix model_name parameter to match evaluator __init__ signature.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@dargilco
Member

dargilco commented Apr 3, 2026

Reminder to run 'black' tool. Thanks!
black --config ../../../eng/black-pyproject.toml .

@dargilco
Member

dargilco commented Apr 3, 2026

Regarding the MyPy error, note that Azure SDK has recently updated their tools. They no longer use "tox". This is the new command to run MyPy: azpysdk mypy . (see https://github.com/Azure/azure-sdk-for-python/blob/main/doc/dev/tests.md#running-checks-locally)

w-javed and others added 2 commits April 3, 2026 07:11
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@@ -8,7 +8,7 @@ def __init__(self, *, config: str, threshold, **kwargs):
Member


threshold not needed? what is config?

Contributor Author


I guess I can remove config from this simple evaluator and just use a threshold of 50 characters: if the response length exceeds the threshold, it passes; otherwise it fails. That would be a simple sample.
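The simplified evaluator proposed in this comment could be sketched as follows. The class name and exact fields are illustrative assumptions, not the merged sample code.

```python
# Minimal sketch of the length-threshold evaluator described in the comment:
# no config parameter, just a fixed character-count threshold.

class LengthEvaluator:
    def __init__(self, *, threshold: int = 50):
        self.threshold = threshold

    def __call__(self, *, response: str) -> dict:
        # Pass when the response is longer than the threshold.
        passed = len(response) > self.threshold
        return {
            "score": len(response),
            "label": "pass" if passed else "fail",
            "threshold": self.threshold,
            "passed": passed,
        }
```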

"reason": result.get("reason", "No reason provided"),
"explanation": result.get("explanation", "No explanation provided"),
"threshold": threshold,
"passed": passed,
Member


Thinking more about it, I would prefer to just mention in a comment that passed can be calculated in the evaluator logic, but not actually implement it (or perhaps leave it commented out). I hope this is an unusual case: normally the user will set up threshold/default/direction in the evaluator metadata and let us do the calculation.
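The reviewer's suggestion could be sketched as follows: the evaluator returns only the raw score, and a comment notes that pass/fail is derived by the service. The metadata field names (threshold/default/direction) are taken from the comment above; everything else here is an illustrative assumption.

```python
# Sketch of the reviewer's preference: omit "passed" from the evaluator
# output and let the evaluation service derive it from metadata.

def evaluate(score: float) -> dict:
    return {
        "score": score,
        # "passed" could be computed here, e.g.:
        #   "passed": score >= threshold,
        # but normally the evaluation service derives pass/fail from the
        # threshold/default/direction declared in the evaluator metadata.
    }
```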

folder structure (common_util/) using `evaluators.upload()`.
2. Create an evaluation (eval) that references the uploaded evaluator.
3. Run the evaluation with inline data and poll for results.
1. Run the FriendlyEvaluator standalone to verify it works locally.
Member

@aprilk-ms aprilk-ms Apr 3, 2026


The file name is a bit weird. Can we replace "friendly" with what we are trying to demonstrate? We have 2 samples; maybe "basic" and "advanced" if it's hard to be more specific?

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
